Data Sets
-
AO-CHILDES
5M words of American-English child-directed speech ordered by the age of the child.
[more info]
-
SRL-CHILDES
200K American-English utterances from the CHILDES database annotated with semantic role labels using the PropBank formalism.
[more info]
-
AO-CHILDES Relations
Word pairs in AO-CHILDES organized by semantic relation (e.g., the pair cat-animal is an instance of hypernymy).
[more info]
-
MissingAdjunct
A corpus of pseudo-English sentences with experimenter-controlled statistics for evaluating the ability of distributional models to infer the most plausible adjunct of a phrase.
[more info]